Search Result

Patent text classification based on ALBERT and bidirectional gated recurrent unit

WEN Chaodong, ZENG Cheng, REN Junwei, ZHANG Yan

Journal of Computer Applications 2021, 41 (2): 407-412. DOI: 10.11772/j.issn.1001-9081.2020050730

With the rapid increase in the number of patent applications, the demand for automatic classification of patent text is growing. Most existing patent text classification algorithms use methods such as Word2vec and Global Vectors (GloVe) to obtain word vector representations of the text, which discard much of the word position information and cannot express the complete semantics of the text. To solve these problems, a multilevel patent text classification model named ALBERT-BiGRU was proposed by combining ALBERT (A Lite BERT) and BiGRU (Bidirectional Gated Recurrent Unit). In this model, dynamic word vectors pre-trained by ALBERT replaced the static word vectors trained by traditional methods such as Word2vec, improving the representation ability of the word vectors. The BiGRU neural network was then used for training, which preserved the semantic associations between long-distance words in the patent text as far as possible. In validation experiments on the patent text dataset published by the State Information Center, compared with Word2vec-BiGRU and GloVe-BiGRU, the accuracy of ALBERT-BiGRU increased by 9.1 and 10.9 percentage points respectively at the department level of patent text, and by 9.5 and 11.2 percentage points respectively at the big class level. Experimental results show that ALBERT-BiGRU can effectively improve the classification of patent texts at different levels.
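
The bidirectional GRU encoding step described in this abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the random vectors stand in for ALBERT token embeddings, the weights are untrained, bias terms are omitted, and the names `GRUCell` and `bigru_encode` are hypothetical. It only shows how a forward and a backward pass over the same sequence are concatenated into one feature vector.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class GRUCell:
    """Single GRU layer without bias terms (a simplification)."""

    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        # One (W, U) weight pair per gate: update z, reset r, candidate h.
        self.W = {g: random_matrix(hidden_size, input_size, rng) for g in "zrh"}
        self.U = {g: random_matrix(hidden_size, hidden_size, rng) for g in "zrh"}

    def step(self, x, h):
        z = [sigmoid(a + b) for a, b in
             zip(matvec(self.W["z"], x), matvec(self.U["z"], h))]
        r = [sigmoid(a + b) for a, b in
             zip(matvec(self.W["r"], x), matvec(self.U["r"], h))]
        reset_h = [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(a + b) for a, b in
                zip(matvec(self.W["h"], x), matvec(self.U["h"], reset_h))]
        # Blend the old state and the candidate state with the update gate.
        return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]

    def run(self, sequence):
        h = [0.0] * self.hidden_size
        for x in sequence:
            h = self.step(x, h)
        return h


def bigru_encode(sequence, forward_cell, backward_cell):
    """Concatenate the final states of a forward and a backward pass."""
    return forward_cell.run(sequence) + backward_cell.run(list(reversed(sequence)))


rng = random.Random(0)
# Placeholder for ALBERT output: 5 tokens, 8 dimensions each (assumed sizes).
embeddings = [[rng.uniform(-1, 1) for _ in range(8)] for _ in range(5)]
features = bigru_encode(embeddings, GRUCell(8, 4, rng), GRUCell(8, 4, rng))
print(len(features))  # forward (4) + backward (4) hidden units
```

In a full classifier, a vector such as `features` would be passed to a trained output layer; that part is left out here.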

Malicious webpage integrated detection method based on Stacking ensemble algorithm

PIAOYANG Heran, REN Junling

Journal of Computer Applications 2019, 39 (4): 1081-1088. DOI: 10.11772/j.issn.1001-9081.2018091926

To address the excessive resource cost, long detection period and low classification performance of mainstream malicious webpage detection technology, a Stacking-based integrated detection method for malicious webpages was proposed, which applies the integration of heterogeneous classifiers to malicious webpage detection and recognition. The detection model was obtained by extracting and analyzing the relevant webpage features and then performing classification and ensemble learning. In the detection model, the primary classifiers were built with the K-Nearest Neighbors (KNN), logistic regression and decision tree algorithms respectively, and a Support Vector Machine (SVM) classifier served as the secondary classifier. Compared with traditional malicious webpage detection methods, the proposed method improves the recognition accuracy by 0.7% and reaches an accuracy of 98.12% with low resource consumption and high speed. The experimental results show that the detection model constructed by the proposed method can recognize malicious webpages efficiently and accurately.
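
The classifier layout named in this abstract (KNN, logistic regression and decision tree as primary classifiers, SVM as secondary classifier) can be sketched with scikit-learn's `StackingClassifier`. This is an illustration under stated assumptions, not the authors' pipeline: the synthetic dataset from `make_classification` stands in for extracted webpage features, and all hyperparameters are library defaults or arbitrary choices.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for webpage feature vectors (label 1 = malicious, assumed).
X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Primary (level-0) classifiers: heterogeneous models trained on the same features.
primary = [
    ("knn", KNeighborsClassifier()),
    ("lr", LogisticRegression(max_iter=1000)),
    ("dt", DecisionTreeClassifier(random_state=0)),
]

# Secondary (level-1) classifier: an SVM trained on the primary classifiers'
# cross-validated predictions.
model = StackingClassifier(estimators=primary, final_estimator=SVC(), cv=5)
model.fit(X_train, y_train)

predictions = model.predict(X_test)
accuracy = model.score(X_test, y_test)
print(f"held-out accuracy on synthetic data: {accuracy:.3f}")
```

The accuracy printed here reflects only the synthetic data and says nothing about the figures reported in the abstract.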

Anti-collision algorithm for RFID based on counter and bi-slot

MO Lei, CHEN Wei, REN Ju

Journal of Computer Applications 2017, 37 (8): 2168-2172. DOI: 10.11772/j.issn.1001-9081.2017.08.2168

To address the problems of the binary search anti-collision algorithm in Radio Frequency IDentification (RFID) systems, such as the large number of searches and the large amount of communication data, a new RFID anti-collision algorithm with counter and bi-slot, named CBS, was proposed based on the regressive search tree algorithm and the time slot algorithm. The tags were searched step by step according to the slot counter in each tag and the collision bit information received by the reader. The responding tags were divided into two groups, which returned their data to the reader in two time slots. The reader sent only the position of the highest collision bit, and the tags sent only the data bits after the highest collision bit. Theoretical analysis and simulation results showed that, compared with the traditional Regressive Binary Search (RBS) algorithm, the CBS algorithm reduced the number of searches by more than 51% and the communication data by more than 65%. The CBS algorithm outperforms the commonly used anti-collision algorithms: it greatly reduces the number of searches and the communication data, and improves search efficiency.
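
The idea of splitting colliding tags at the highest collision bit and letting the two groups answer in two slots can be illustrated with a small simulation. This is a simplified sketch, not the CBS protocol itself: the per-tag slot counter and the exact command format are not modelled, tag IDs are assumed to be unique bit strings of equal length, and the function names are hypothetical.

```python
def highest_collision_bit(ids):
    """Return the most significant bit position where the IDs differ, or None."""
    for i in range(len(ids[0])):
        if len({tag[i] for tag in ids}) > 1:
            return i
    return None


def identify(tags):
    """Identify all tags; return (identified IDs, number of reader commands)."""
    if not tags:
        return [], 0
    identified = []
    queries = 0
    pending = [list(tags)]  # groups of tags whose responses still collide
    while pending:
        group = pending.pop()
        queries += 1  # one reader command, answered in two time slots
        pos = highest_collision_bit(group)
        if pos is None:  # no collision: a single tag answered
            identified.extend(group)
            continue
        for bit in "01":  # slot for bit 0, then slot for bit 1
            slot = [tag for tag in group if tag[pos] == bit]
            if len(slot) == 1:
                identified.append(slot[0])  # read cleanly within its slot
            else:
                pending.append(slot)  # still colliding, query again later


    return identified, queries


ids, queries = identify(["0001", "0100", "0110", "1010"])
print(ids, queries)  # 3 reader commands for these four tags
```

Because each command resolves two groups at once, a tag that is alone in its slot is identified without a further command, which is where the reduction in search count comes from in this simplified model.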

Automatic generation of test data for extended finite state machine models based on Tabu search algorithm

REN Jun, ZHAO Rui-lian, LI Zheng

Journal of Computer Applications 2011, 31 (09): 2440-2443. DOI: 10.3724/SP.J.1087.2011.02440

Test case generation for Extended Finite State Machine (EFSM) models includes test path generation and test data generation. However, most current research on EFSM testing focuses on test path generation. To explore automatic test generation, a path-oriented test data generation method for EFSM models was proposed. A Tabu Search (TS) strategy was adopted to generate test data automatically, and the key factors that affect the performance of test data generation in EFSM models were analyzed. Moreover, the test generation efficiency was compared with that of the Genetic Algorithm (GA). The experimental results show that the proposed method is promising and effective, and that it is clearly superior to the GA in test generation for EFSM models.
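
A generic tabu search for path-oriented test data can be sketched as follows. This is an assumption-laden illustration, not the method from the paper: the path predicate (`x > 10`, `y == 2 * x`, `x + y < 100`) is hypothetical, the fitness is a simple sum of branch distances, and the neighbourhood and tabu list size are arbitrary choices.

```python
from collections import deque


def branch_distance(x, y):
    """Distance to satisfying the hypothetical path predicate; 0 means satisfied."""
    distance = 0
    distance += max(0, 11 - x)      # x > 10
    distance += abs(y - 2 * x)      # y == 2 * x
    distance += max(0, x + y - 99)  # x + y < 100
    return distance


def neighbours(point, steps=(1, 5, 25)):
    """Move one input variable up or down by each step size."""
    x, y = point
    for s in steps:
        for dx, dy in ((s, 0), (-s, 0), (0, s), (0, -s)):
            yield (x + dx, y + dy)


def tabu_search(start, max_iters=1000, tabu_size=50):
    current = best = start
    tabu = deque([start], maxlen=tabu_size)  # recently visited points
    for _ in range(max_iters):
        if branch_distance(*best) == 0:
            break  # test data that drives the path has been found
        candidates = [p for p in neighbours(current) if p not in tabu]
        if not candidates:
            break
        # Take the best non-tabu neighbour, even if it is worse than the
        # current point; this is what lets the search leave local minima.
        current = min(candidates, key=lambda p: branch_distance(*p))
        tabu.append(current)
        if branch_distance(*current) < branch_distance(*best):
            best = current
    return best


result = tabu_search((0, 0))
print(result, branch_distance(*result))
```

Starting from `(0, 0)`, every single move makes the fitness worse, so a plain hill climber would stall; the tabu list forces the search onward until the predicate is satisfied.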